Mining Knowledge from Wikipedia for the Question Answering task

Authors

  • Davide Buscaldi
  • Paolo Rosso
Abstract

Although Question Answering technology has advanced significantly in recent years, further work is needed to obtain better results. Moreover, the best systems at the CLEF and TREC evaluation exercises are very complex and rely on custom-built, expensive ontologies whose aim is to provide the systems with encyclopedic knowledge. In this paper we investigated the use of Wikipedia, the open-domain encyclopedia, for the Question Answering task. Previous work treated Wikipedia as a resource in which to look for the answers to the questions. We focused on different aspects of the problem: the validation of the answers returned by our Question Answering system, and the use of Wikipedia “categories” to determine a set of patterns that the expected answer should fit. Validation consists in deciding, given a candidate answer, whether it is the right one or not. The possibility of exploiting the categories of Wikipedia had not been considered until now. We performed our experiments using the Spanish version of Wikipedia, with the question set of the last CLEF Spanish monolingual exercise. The results show that Wikipedia is a potentially useful resource for the Question Answering task.

Similar resources

Grammars for Question Answering Systems Based on Intelligent Text Mining in Biomedicine

A top-down approach to the determination of grammar engineering methods for the construction of Natural Language Grammars for Question Answering Systems based on Intelligent Text Mining for Biomedical applications is presented in the present paper. Our proposal is to start with the formulation of the ultimate task goal such as specialized question answering and derive from it the specification...

Distributed NLP and Machine Learning for Question Answering Grid

We regard question answering as a Semantic Grid application in which the answering process is best expressed as a distributed computing task, and show how the workflow control of this distributed QA task can be learned automatically. Since the control protocol contains information about each resource and service, this information can be mined to reveal semantics about the components. We address t...

Extracting Answers from the Web Using Knowledge Annotation and Knowledge Mining Techniques

Aranea is a question answering system that extracts answers from the World Wide Web using knowledge annotation and knowledge mining techniques. Knowledge annotation, which utilizes semistructured database techniques, is effective for answering large classes of commonly occurring questions. Knowledge mining, which utilizes statistical techniques, can leverage the massive amounts of data availabl...

Extracting Answers from the Web Using Data Annotation and Knowledge Mining Techniques

Aranea is a question answering system that extracts answers from the World Wide Web using knowledge annotation and knowledge mining techniques. Knowledge annotation, which utilizes semistructured database techniques, is effective for answering large classes of commonly occurring questions. Knowledge mining, which utilizes statistical techniques, can leverage the massive amounts of data availabl...

Optimizing question answering systems by Accelerated Particle Swarm Optimization (APSO)

One of the most important research areas in natural language processing is Question Answering Systems (QASs). Existing search engines, with Google at the top, have many remarkable capabilities. But there is a basic limitation (search engines do not have deduction capability), a capability which a QAS is expected to have. In this perspective, a search engine may be viewed as a semi-mechanized QA...

Publication date: 2006